A RECURSIVE MAXIMUM LIKELIHOOD DECODING



The Viterbi algorithm is a simple and efficient method of implementing
maximum likelihood decoding. However, if we take advantage of the structural
properties of a trellis section, other efficient trellis-based decoding
algorithms can be devised. Recently, an efficient trellis-based recursive
maximum likelihood decoding (RMLD) algorithm for linear block codes has been
proposed [37]. This algorithm is more efficient than the conventional Viterbi
algorithm in both computation and hardware requirements. Most importantly, its
implementation does not require the construction of the entire code trellis;
only some special one-section trellises of relatively small state and branch
complexities are needed for constructing path (or branch) metric tables
recursively. At the end, there is only one table, which contains only the most
likely codeword and its metric for a given received sequence
$r = (r_1, r_2, \ldots, r_N)$. This algorithm basically uses the
divide-and-conquer strategy. Furthermore, it allows parallel/pipeline
processing of received sequences to speed up decoding.
11.1 BASIC CONCEPTS

Consider a binary $(N, K)$ linear block code $C$. Suppose a codeword is
transmitted and $r = (r_1, r_2, \ldots, r_N)$ is the received vector at the
output of the matched filter of the receiver.

Let $T$ be the minimal trellis diagram for $C$. Consider the trellis section
from time-$x$ to time-$y$. As shown in Section 6.2, a composite branch between
two adjacent states in this trellis section is a coset in
$p_{x,y}(C)/C^{tr}_{x,y}$, and a composite branch may appear many times, as
shown in (6.13). Using this fact, we can reduce the decoding complexity by
processing only the distinct composite branches in each trellis section. To
achieve this, we form a table for the metrics of composite branches, which for
each coset $D$ in $p_{x,y}(C)/C^{tr}_{x,y}$ stores the largest metric in $D$,
denoted $m(D)$, and the label of the branch with the largest metric, denoted
$l(D)$. This table is called the composite branch metric table, denoted
$CBT_{x,y}$, for the trellis section between time-$x$ and time-$y$. Since the
set of cosets $p_{0,N}(C)/C^{tr}_{0,N} = C/C$ consists of $C$ only, the table
$CBT_{0,N}$ contains only the codeword in $C$ that has the largest metric. This
is the most likely codeword. The RMLD algorithm is simply an algorithm that
constructs a composite branch metric table recursively from tables for trellis
sections of shorter lengths to reduce computational complexity. When the table
$CBT_{0,N}$ is constructed, the decoding is completed and $CBT_{0,N}$ contains
the decoded codeword.

A straightforward method to construct the table $CBT_{x,y}$ is to compute the
metrics of all the vectors in the punctured code $p_{x,y}(C)$, and then find
the vector with the largest metric for every coset in
$p_{x,y}(C)/C^{tr}_{x,y}$ by comparing the metrics of vectors in the coset.
This method is efficient only when $y - x$ is small and should only be used at
the bottom (or the beginning) of the recursive construction procedure. When
$y - x$ is large, $CBT_{x,y}$ is constructed from $CBT_{x,z}$ and $CBT_{z,y}$
for a properly chosen integer $z$ with $x < z < y$.

Therefore, the key part of the RMLD algorithm is to construct the metric
table $CBT_{x,y}$ from the tables $CBT_{x,z}$ and $CBT_{z,y}$. First we must
show that this can be done. For two adjacent states $\sigma_x$ and $\sigma_y$,
with $\sigma_x \in \Sigma_x(C)$ and $\sigma_y \in \Sigma_y(C)$, let

$$\Sigma_z(\sigma_x,\sigma_y) \triangleq \{\sigma_z^{(1)}, \sigma_z^{(2)}, \ldots, \sigma_z^{(\mu)}\} \qquad (11.1)$$

denote the subset of states in $\Sigma_z(C)$ through which the paths in
$L(\sigma_x,\sigma_y)$ connect $\sigma_x$ to $\sigma_y$, as shown in
Figure 11.1. Then

$$L(\sigma_x,\sigma_y) = \bigcup_{\sigma_z^{(i)} \in \Sigma_z(\sigma_x,\sigma_y)} L(\sigma_x,\sigma_z^{(i)}) \circ L(\sigma_z^{(i)},\sigma_y). \qquad (11.2)$$

Figure 11.1. Connection between two states.

It follows from (11.2) and the definitions of the metric and label of a coset
(or a composite branch) that

$$m(L(\sigma_x,\sigma_y)) = \max_{\sigma_z^{(i)} \in \Sigma_z(\sigma_x,\sigma_y)} \{ m(L(\sigma_x,\sigma_z^{(i)})) + m(L(\sigma_z^{(i)},\sigma_y)) \} \qquad (11.3)$$

and

$$l(L(\sigma_x,\sigma_y)) = l(L(\sigma_x,\sigma_z^{(i_{\max})})) \circ l(L(\sigma_z^{(i_{\max})},\sigma_y)), \qquad (11.4)$$

where $i_{\max}$ is the index for which the sum in (11.3) takes its maximum.
Note that the metrics $m(L(\sigma_x,\sigma_z^{(i)}))$ and
$m(L(\sigma_z^{(i)},\sigma_y))$ and the labels $l(L(\sigma_x,\sigma_z^{(i)}))$
and $l(L(\sigma_z^{(i)},\sigma_y))$ are stored in the composite branch metric
tables $CBT_{x,z}$ and $CBT_{z,y}$. The state set
$\Sigma_z(\sigma_x,\sigma_y)$ can be determined from the code trellis.
Therefore, (11.3) and (11.4) show that the composite branch metric table
$CBT_{x,y}$ can be constructed from $CBT_{x,z}$ and $CBT_{z,y}$.

Based on the structural properties of a sectionalized trellis, we can readily
show that

$$\mu = |\Sigma_z(\sigma_x,\sigma_y)| = 2^{k(C_{x,y}) - k(C_{x,z}) - k(C_{z,y})}. \qquad (11.5)$$

This says that if we compute the metric $m(L(\sigma_x,\sigma_y))$ from (11.3)
using the tables $CBT_{x,z}$ and $CBT_{z,y}$, we need to perform
$|\Sigma_z(\sigma_x,\sigma_y)|$ additions and
$|\Sigma_z(\sigma_x,\sigma_y)| - 1$ comparisons. However, if we compute the
metric $m(L(\sigma_x,\sigma_y))$ directly from the parallel branches in
$L(\sigma_x,\sigma_y)$, we need to compute
$|L(\sigma_x,\sigma_y)| = 2^{k(C_{x,y})}$ branch metrics and perform
$2^{k(C_{x,y})} - 1$ comparisons. For large $y - x$, $2^{k(C_{x,y})}$ is much
larger than $|\Sigma_z(\sigma_x,\sigma_y)|$, and hence constructing the metric
table $CBT_{x,y}$ from the tables $CBT_{x,z}$ and $CBT_{z,y}$ requires far
fewer additions and comparisons than the direct construction of $CBT_{x,y}$
from vectors in $p_{x,y}(C)$ and cosets in $p_{x,y}(C)/C^{tr}_{x,y}$.
Therefore, recursive construction of composite branch metric tables for
trellis sections of longer lengths from tables for trellis sections of shorter
lengths reduces decoding computational complexity.
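
As a quick numerical illustration of (11.5), the section used in Example 11.1 later in this chapter can be plugged in. The sketch below assumes the values stated there for $C = RM_{2,4}$ with $x = 4$, $z = 8$, $y = 12$ (namely $C^{tr}_{4,12} = RM_{1,3}$ and $C^{tr}_{4,8} = C^{tr}_{8,12} = RM_{0,2}$), together with the standard dimensions $k(RM_{1,3}) = 4$ and $k(RM_{0,2}) = 1$.

```latex
\mu = 2^{k(C_{4,12}) - k(C_{4,8}) - k(C_{8,12})} = 2^{4-1-1} = 4,
\qquad
|L(\sigma_4,\sigma_{12})| = 2^{k(C_{4,12})} = 2^{4} = 16 .
```

Under these assumptions, obtaining one entry of $CBT_{4,12}$ from $CBT_{4,8}$ and $CBT_{8,12}$ takes 4 additions and 3 comparisons, whereas the direct computation from the parallel branches needs 16 branch metrics and 15 comparisons.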

11.2 THE GENERAL ALGORITHM 

Now we describe the general framework of the RMLD algorithm for constructing
the composite branch metric table $CBT_{x,y}$ for decoding a received sequence
$r$. We denote this algorithm by RMLD($x,y$). It uses two procedures, denoted
MakeCBT($x,y$) and CombCBT($x,y;z$), which are defined as follows:

■ MakeCBT($x,y$): Construct the table $CBT_{x,y}$ directly, as described later.

■ CombCBT($x,y;z$): Given the tables $CBT_{x,z}$ and $CBT_{z,y}$ as inputs,
where $x < z < y$, combine these tables to form $CBT_{x,y}$ as shown in (11.3)
and (11.4).

The procedure CombCBT($x,y;z$) can be expressed as

CombCBT(RMLD($x,z$), RMLD($z,y$))

to show its recursive nature.

Figure 11.2. Illustration of the recursion process of the RMLD algorithm.

[Algorithm RMLD(x,y)] 

Construct $CBT_{x,y}$ using the less complex of the following two options:

(1) Execute MakeCBT($x,y$), or

(2) Execute CombCBT(RMLD($x,z$), RMLD($z,y$)), where $z$ with $x < z < y$
is selected to minimize computational complexity.

Decoding is accomplished by executing RMLD($0,N$). The recursion process is
depicted in Figure 11.2. We see that the RMLD algorithm allows
parallel/pipeline processing of received words, which speeds up the decoding
process.
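
The recursion can be sketched in a few lines of Python. This is only a brute-force illustration under stated assumptions, not the implementation of [37]: cosets are identified by enumerating all codewords rather than by a special trellis, the split point is always the midpoint rather than an optimized $z$, the metric uses $M(r,1) = r$ and $M(r,0) = -r$, and all function names (`span`, `subcode`, `coset_key`, `make_cbt`, `comb_cbt`, `rmld`) as well as the toy (8,4) code RM(1,3) are chosen here for illustration.

```python
import itertools
import random

def span(gen_rows):
    """Enumerate every codeword of the binary code generated by gen_rows."""
    n = len(gen_rows[0])
    return {
        tuple(sum(a * g[i] for a, g in zip(coeffs, gen_rows)) % 2 for i in range(n))
        for coeffs in itertools.product((0, 1), repeat=len(gen_rows))
    }

def subcode(code, x, y):
    """C^tr_{x,y}: codewords that vanish outside [x, y), truncated to [x, y)."""
    return {c[x:y] for c in code if not any(c[:x]) and not any(c[y:])}

def coset_key(v, sub):
    """Canonical representative (lexicographic minimum) of the coset v + sub."""
    return min(tuple(a ^ b for a, b in zip(v, s)) for s in sub)

def metric(v, r_seg):
    """Correlation metric with M(r, 1) = r and M(r, 0) = -r."""
    return sum(r if bit else -r for bit, r in zip(v, r_seg))

def make_cbt(code, r, x, y):
    """MakeCBT(x, y): scan all of p_{x,y}(C), keep the best vector per coset."""
    sub, table = subcode(code, x, y), {}
    for v in {c[x:y] for c in code}:
        key, m = coset_key(v, sub), metric(v, r[x:y])
        if key not in table or m > table[key][0]:
            table[key] = (m, v)
    return table

def comb_cbt(code, x, y, z, left, right):
    """CombCBT(x, y; z): one addition per distinct pair of sub-cosets, as in (11.3)."""
    sub, sub_l, sub_r = subcode(code, x, y), subcode(code, x, z), subcode(code, z, y)
    triples = {
        (coset_key(c[x:y], sub), coset_key(c[x:z], sub_l), coset_key(c[z:y], sub_r))
        for c in code
    }
    table = {}
    for d_y, d_left, d_right in triples:
        m = left[d_left][0] + right[d_right][0]
        if d_y not in table or m > table[d_y][0]:
            table[d_y] = (m, left[d_left][1] + right[d_right][1])  # label as in (11.4)
    return table

def rmld(code, r, x, y):
    """RMLD(x, y) with a fixed midpoint split (not an optimized sectionalization)."""
    if y - x <= 2:
        return make_cbt(code, r, x, y)
    z = (x + y) // 2
    return comb_cbt(code, x, y, z, rmld(code, r, x, z), rmld(code, r, z, y))

# Toy example: RM(1,3), the (8,4) first-order Reed-Muller code.
G = [(1,) * 8, (0, 0, 0, 0, 1, 1, 1, 1), (0, 0, 1, 1, 0, 0, 1, 1), (0, 1, 0, 1, 0, 1, 0, 1)]
code = span(G)
random.seed(1)
r = [random.gauss(0, 1) for _ in range(8)]
(best_metric, best_word), = rmld(code, r, 0, 8).values()  # CBT_{0,N} has one entry
```

The final table holds a single entry, which should agree with an exhaustive search over all codewords. A practical decoder would precompute the coset structure once, since it does not depend on the received sequence.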

The MakeCBT($x,y$) procedure is efficient only when $y - x$ is small and
should only be used at the bottom (or the beginning) of the recursive
construction procedure. When $y - x$ is large, $CBT_{x,y}$ is constructed from
$CBT_{x,z}$ and $CBT_{z,y}$ for a properly chosen $z$ with $x < z < y$. At the
bottom of the recursion process, $y - x$ is small, and the computation done by
the MakeCBT procedure during the entire decoding process is also small.
Therefore, the major computation is carried out by the CombCBT procedure.
Hence, the CombCBT procedure is the major procedure in the RMLD algorithm and
should be devised to reduce either the total number of computations for
software implementation or the circuit requirement and chip size for IC
implementation.

In a soft-decision decoding algorithm, addition and comparison operations for
metrics are considered the basic operations. An addition operation and a
comparison operation are in general assumed to have equal weight (or cost).

Let $\psi_M(x,y)$ and $\psi_C(x,y;z)$ denote the number of basic operations
required to execute the procedure MakeCBT($x,y$) and the procedure
CombCBT($x,y;z$), respectively. The values of $\psi_M(x,y)$ and
$\psi_C(x,y;z)$ depend on the implementation of the RMLD algorithm. Assume
that the formulas for $\psi_M(x,y)$ and $\psi_C(x,y;z)$ are given. To minimize
the overall decoding complexity of the RMLD algorithm, the sectionalization of
a trellis (the choices of $z$) must be done properly. A sectionalization which
gives the smallest overall decoding complexity for given $\psi_M(x,y)$ and
$\psi_C(x,y;z)$ is called the optimum sectionalization for the code.

Let $\psi_{\min}(x,y)$ denote the smallest number of operations required to
construct the table $CBT_{x,y}$. Then it follows from the algorithm
RMLD($x,y$) given above that

$$\psi_{\min}(x,y) = \begin{cases} \psi_M(x,y), & \text{if } x + 1 = y, \\ \min\{\psi_M(x,y),\ \min_{x<z<y} \psi_{\min}(x,y;z)\}, & \text{otherwise,} \end{cases} \qquad (11.6)$$

where

$$\psi_{\min}(x,y;z) = \psi_{\min}(x,z) + \psi_{\min}(z,y) + \psi_C(x,y;z). \qquad (11.7)$$

The total number of operations required to decode a received word is given by
$\psi_{\min}(0,N)$.

By using (11.6) and (11.7) together with the formulas for $\psi_M(x,y)$ and
$\psi_C(x,y;z)$, we can compute $\psi_{\min}(x,y)$ for every $(x,y)$ with
$0 \le x < y \le N$ efficiently in the following way. The values of
$\psi_{\min}(x,x+1)$ for $0 \le x < N$ are computed using the given formula
for $\psi_M(x,y)$. For an integer $i$ with $0 \le x < x + i \le N$,
$\psi_{\min}(x,x+i)$ can be computed from the values $\psi_{\min}(x',y')$ with
$y' - x' < i$ and the given formulas for $\psi_M(x,y)$ and $\psi_C(x,y;z)$. By
keeping track of the values of $z$ selected in the above procedure, it is easy
to find an optimum sectionalization. If $\psi_M(x,y)$ and $\psi_C(x,y;z)$ are
independent of the received sequence $r$ for any $0 \le x < y \le N$ and
$x < z < y$, then the optimum sectionalization can be fixed.
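
The bottom-up evaluation of (11.6) and (11.7) described above can be sketched as a small dynamic program. The function name `optimum_sectionalization` and the callables `psi_M` and `psi_C` are hypothetical placeholders for whatever cost formulas an implementation supplies; the toy costs in the usage lines are made up purely to exercise the recursion.

```python
def optimum_sectionalization(N, psi_M, psi_C):
    """Evaluate (11.6)-(11.7) bottom-up and recover the chosen sections.

    psi_M(x, y) and psi_C(x, y, z) are caller-supplied cost formulas
    (hypothetical callables; they must not depend on the received sequence).
    Returns (psi_min(0, N), list of sections handled directly by MakeCBT).
    """
    cost, cut = {}, {}
    for length in range(1, N + 1):            # shorter sections first
        for x in range(0, N - length + 1):
            y = x + length
            best, best_z = psi_M(x, y), None  # option (1): MakeCBT(x, y)
            for z in range(x + 1, y):         # option (2): CombCBT(x, y; z)
                total = cost[x, z] + cost[z, y] + psi_C(x, y, z)
                if total < best:
                    best, best_z = total, z
            cost[x, y], cut[x, y] = best, best_z

    def leaves(x, y):
        """Sections on which MakeCBT is executed in the optimum recursion."""
        z = cut[x, y]
        return [(x, y)] if z is None else leaves(x, z) + leaves(z, y)

    return cost[0, N], leaves(0, N)

# Toy cost model (made up): direct construction costs 2^(y-x), combining costs 1.
total, sections = optimum_sectionalization(
    4, lambda x, y: 2 ** (y - x), lambda x, y, z: 1
)
# total == 9 and sections == [(0, 2), (2, 4)] for this toy model
```

With realistic formulas such as (11.8) and (11.16), the same routine returns the operation count of the best sectionalization together with the bottom-level sections.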

11.3 DIRECT METHODS FOR CONSTRUCTING COMPOSITE 
BRANCH METRIC TABLES 

For two integers $x$ and $y$ such that $0 \le x < y \le N$, a straightforward
way to construct the composite branch metric table $CBT_{x,y}$ directly is to
compute the metrics of all the vectors in the punctured code $p_{x,y}(C)$
independently, and then find the vector (branch) with the largest metric for
every coset in $p_{x,y}(C)/C^{tr}_{x,y}$ by comparing the metrics of vectors
in the coset. Each surviving vector and its metric are stored in the table
$CBT_{x,y}$. Let MakeCBT-I($x,y$) denote this procedure.

The number of addition-equivalent operations required to construct the table
$CBT_{x,y}$ by executing MakeCBT-I($x,y$), denoted $\psi_M^{(I)}(x,y)$, is
given as follows:

$$\psi_M^{(I)}(x,y) = (y - x - 1) \cdot 2^{k(p_{x,y}(C))} + 2^{k(p_{x,y}(C)) - k(C_{x,y})} (2^{k(C_{x,y})} - 1). \qquad (11.8)$$

The first term is the number of additions to compute all the metrics for the
vectors in $p_{x,y}(C)$, and the second term is the number of comparisons for
finding the vectors with the largest metrics by comparing the metrics of
vectors in each coset in $p_{x,y}(C)/C^{tr}_{x,y}$.

A more efficient method for constructing the table $CBT_{x,y}$ is to compute
the metrics of the $2^{y-x}$ branch labels following the order of the Gray
code, as proposed in [60, 102]. Let MakeCBT-G($x,y$) denote this procedure,
where G stands for Gray code. Assume that the bit metric satisfies the
following condition: $M(r,0) = -M(r,1)$, where $r$ is a received symbol. This
condition holds for the AWGN channel with BPSK transmission and $M(r,1) = r$.
We also assume that the all-one vector of length $y - x$, denoted $1_{y-x}$,
is in $p_{x,y}(C)$ for any $x$ and $y$ with $0 \le x < y \le N$. In this case,
the metrics of $2^{y-x-1}$ labels are computed first in the order of the Gray
code, and then the remaining metrics are computed by negating the first
$2^{y-x-1}$ metrics. If $1_{y-x} \in C^{tr}_{x,y}$, then for any vector in a
coset of $C^{tr}_{x,y}$, the complementary vector is in the same coset. In
this case, we can simply discard the branches with negative metrics [60, 102]
when finding the largest metric in each coset.
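Let $\psi_M^{(G)}(x,y)$ denote the value of $\psi_M(x,y)$ using the
MakeCBT-G($x,y$) procedure. Assume that negation is costless. Then,

$$\psi_M^{(G)}(x,y) = \begin{cases} 2^{y-x-1} + y - x - 2 + 2^{k(p_{x,y}(C)) - k(C_{x,y})} (2^{k(C_{x,y})-1} - 1), & \text{if } 1_{y-x} \in C^{tr}_{x,y}, \\ 2^{y-x-1} + y - x - 2 + 2^{k(p_{x,y}(C)) - k(C_{x,y})} (2^{k(C_{x,y})} - 1), & \text{otherwise.} \end{cases} \qquad (11.9)$$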

For small $y - x$, the dimension of $p_{x,y}(C)$ is close to $y - x$. The
computational complexities of both the MakeCBT-I and MakeCBT-G procedures are
small for small $y - x$. The RMLD algorithm with the MakeCBT-G procedure
requires slightly less computation than that with the MakeCBT-I procedure;
however, the MakeCBT-I procedure is simpler for IC implementation. With the
MakeCBT-G procedure, the metrics of the first $2^{y-x-1}$ labels in
$p_{x,y}(C)$ must be computed serially, whereas with the MakeCBT-I procedure,
the metrics for all the labels in $p_{x,y}(C)$ can be computed independently
in parallel.
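
The Gray-code ordering behind MakeCBT-G can be illustrated with a short sketch. It is not the procedure of [60, 102] itself: it simply enumerates all $2^{y-x}$ binary labels of a section, assuming $M(r,1) = r$ and $M(r,0) = -r$, updates the running metric by $\pm 2 r_i$ when bit $i$ flips, and obtains the second half of the metrics by negation. The function name `gray_code_metrics` is chosen here for illustration, and counting each update as one addition assumes the doubled symbols $2 r_i$ are available.

```python
def gray_code_metrics(r_seg):
    """Metrics of all binary labels of length len(r_seg), in Gray-code order.

    Half of the labels (those whose last bit is 0) are visited in Gray-code
    order with one update per step; each complement is obtained by negation.
    """
    n = len(r_seg)
    v = [0] * n
    m = -sum(r_seg)                  # metric of the all-zero label
    table = {}
    for k in range(2 ** (n - 1)):
        if k > 0:
            i = (k & -k).bit_length() - 1   # bit flipped by the reflected Gray code
            v[i] ^= 1
            m += 2 * r_seg[i] if v[i] else -2 * r_seg[i]
        table[tuple(v)] = m
        table[tuple(1 - b for b in v)] = -m  # complementary label, by negation
    return table

metrics = gray_code_metrics([0.9, -0.2, 0.4, -1.1])
```

A MakeCBT-G style routine would then keep, for each coset of $C^{tr}_{x,y}$, only the labels that belong to $p_{x,y}(C)$ and select the largest metric among them.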


11.4 THE COMBCBT PROCEDURE 

The CombCBT($x,y;z$) procedure simply performs the computation of (11.3) and
finds the label of (11.4). It is important to note that in the construction of
the metric table $CBT_{x,y}$, we do not need to compute the metric

$$m(\sigma_x,\sigma_y) = m(L(\sigma_x,\sigma_y))$$

for every adjacent state pair $(\sigma_x,\sigma_y)$. We only need to compute
$m(\sigma_x,\sigma_y)$ for those adjacent state pairs for which the paths
between each state pair form a distinct coset in $p_{x,y}(C)/C^{tr}_{x,y}$.
Therefore, we only compute the metrics for $2^{k(p_{x,y}(C)) - k(C_{x,y})}$
distinct adjacent state pairs between time-$x$ and time-$y$. This is the key
to reducing computational complexity.

In principle, we can construct the metric table $CBT_{x,y}$ using the section
of the code trellis $T$ from time-$x$ to time-$y$ as follows:

(i) For each coset $D \in p_{x,y}(C)/C^{tr}_{x,y}$, identify a state pair
$(\sigma_x,\sigma_y)$ such that $L(\sigma_x,\sigma_y) = D$;

(ii) Determine the state set $\Sigma_z(\sigma_x,\sigma_y)$; and

(iii) Compute the metric $m(\sigma_x,\sigma_y)$ and the label
$l(\sigma_x,\sigma_y)$ from (11.3) and (11.4), respectively.

However, for long codes, constructing the large trellis section from time-$x$
to time-$y$ and executing steps (i) to (iii) takes considerable effort. The
total number of composite branches in the trellis section between time-$x$ and
time-$y$ can be very large, while the distinct composite branches are only a
small fraction of them. Examining this trellis section can therefore be very
time consuming and wasteful, and the implementation will be complex and
costly.

To overcome the complexity problem and facilitate the computation of (11.3),
we construct a much simpler special two-section trellis for the punctured code
$p_{x,y}(C)$ with section boundary locations in $\{x,z,y\}$ and multiple
“final” states at time-$y$, one for each coset in $p_{x,y}(C)/C^{tr}_{x,y}$.
This special two-section trellis contains only the information needed for
constructing the metric table $CBT_{x,y}$ from $CBT_{x,z}$ and $CBT_{z,y}$ (no
redundancy). For a coset $D_y \in p_{x,y}(C)/C^{tr}_{x,y}$, define

$$S_z(D_y) \triangleq \{ D_z \in p_{x,z}(C)/C^{tr}_{x,z} : D_z \subseteq p_{x,z}(D_y) \}, \qquad (11.10)$$

where $p_{x,z}(D_y)$ is the truncation of the coset $D_y$ from time-$x$ to
time-$z$. For each $D_z \in S_z(D_y)$, there is exactly one coset in
$p_{z,y}(C)/C^{tr}_{z,y}$, denoted $\mathrm{adj}(D_z,D_y)$, such that
$D_z \circ \mathrm{adj}(D_z,D_y) \subseteq D_y$ (see Figure 11.3). Then,

$$D_y = \bigcup_{D_z \in S_z(D_y)} D_z \circ \mathrm{adj}(D_z,D_y). \qquad (11.11)$$

From (11.11) we see that the metric of $D_y$ can be computed from the metrics
of cosets in $p_{x,z}(C)/C^{tr}_{x,z}$ and cosets in
$p_{z,y}(C)/C^{tr}_{z,y}$ (or from the tables $CBT_{x,z}$ and $CBT_{z,y}$)
once the set $S_z(D_y)$ and $\mathrm{adj}(D_z,D_y)$ for each
$D_z \in S_z(D_y)$ are identified. The special two-section trellis to be
constructed simply displays the relationship given by (11.11) and identifies
the set $S_z(D_y)$ for each coset $D_y \in p_{x,y}(C)/C^{tr}_{x,y}$.

Let $\Sigma_z$ and $\Sigma_y$ denote the state spaces of the special
two-section trellis for $p_{x,y}(C)$ at time-$z$ and time-$y$, respectively.
To achieve the purpose described above, the special two-section trellis for
$p_{x,y}(C)$ must have the following structural properties:

(1) There is an initial state, denoted $\sigma_{x,0}$, at time-$x$.

(2) There is a one-to-one correspondence between the states in the state
space $\Sigma_z$ and the cosets in $p_{x,z}(C)/C^{tr}_{x,z}$. Let $D_z$ denote
a coset in $p_{x,z}(C)/C^{tr}_{x,z}$ and $\sigma(D_z)$ denote its
corresponding state at time-$z$. Then the composite branch label between
$\sigma_{x,0}$ and $\sigma(D_z)$ is $L(\sigma_{x,0},\sigma(D_z)) = D_z$.

(3) There is a one-to-one correspondence between the states in the state
space $\Sigma_y$ and the cosets in $p_{x,y}(C)/C^{tr}_{x,y}$. Let $D_y$ denote
a coset in $p_{x,y}(C)/C^{tr}_{x,y}$ and $\sigma(D_y)$ denote its
corresponding state at time-$y$. For any state $\sigma(D_y) \in \Sigma_y$,
$L(\sigma(D_z),\sigma(D_y)) = \mathrm{adj}(D_z,D_y)$ if $D_z \in S_z(D_y)$.
Otherwise, $L(\sigma(D_z),\sigma(D_y)) = \emptyset$.

From the structural properties of the above special two-section trellis, we
see that: (1) for every state $\sigma(D_z) \in \Sigma_z$, its (state) metric
$m(D_z)$ is given in the table $CBT_{x,z}$; and (2) for each composite branch
between a state $\sigma(D_z)$ at time-$z$ and an adjacent state $\sigma(D_y)$
at time-$y$, its composite branch metric, $m(\sigma(D_z),\sigma(D_y))$, is
given in the table $CBT_{z,y}$.

It follows from (11.11) and the structural properties of the above special
two-section trellis for $p_{x,y}(C)$ that for each coset
$D_y \in p_{x,y}(C)/C^{tr}_{x,y}$, the metric $m(D_y)$ is given by

$$m(D_y) = \max_{D_z \in S_z(D_y)} \{ m(D_z) + m(\sigma(D_z),\sigma(D_y)) \}, \qquad (11.12)$$

where the set of states corresponding to $S_z(D_y)$ and the state pairs
$(\sigma(D_z),\sigma(D_y))$ can be easily identified from the special
two-section trellis. Eq. (11.12) is simply equivalent to (11.3). Therefore,
CombCBT($x,y;z$) will be designed to compute the metrics for the table
$CBT_{x,y}$ based on (11.12) using the special two-section trellis. In
general, this special trellis is much simpler than the section of the entire
code trellis $T$ from time-$x$ to time-$y$, except for the cases where $x = 0$
or $y = N$, and is much easier to construct. As a result, the construction of
the metric table $CBT_{x,y}$ is much simpler.

The construction of the above special two-section trellis for $p_{x,y}(C)$ is
done as follows: Choose a basis $\{v_1, v_2, \ldots, v_{k(p_{x,y}(C))}\}$ of
$p_{x,y}(C)$ such that the first $k(C^{tr}_{x,y}) = k(C_{x,y})$ vectors form a
basis of $C^{tr}_{x,y}$. Define

$$n_{x,y} \triangleq y - x + k(p_{x,y}(C)) - k(C_{x,y}). \qquad (11.13)$$

Let $G(x,y)$ be the following $k(p_{x,y}(C)) \times n_{x,y}$ matrix:

$$G(x,y) = \begin{bmatrix} v_1 & \\ \vdots & 0 \\ v_{k(C_{x,y})} & \\ v_{k(C_{x,y})+1} & \\ \vdots & I \\ v_{k(p_{x,y}(C))} & \end{bmatrix},$$

where $0$ denotes the $k(C_{x,y}) \times (k(p_{x,y}(C)) - k(C_{x,y}))$
all-zero matrix, and $I$ denotes the identity matrix of dimension
$k(p_{x,y}(C)) - k(C_{x,y})$. Let $C(x,y)$ be the binary linear code of length
$n_{x,y}$ generated by $G(x,y)$. Construct a 3-section trellis diagram
$T(\{x,z,y,x+n_{x,y}\})$ for $C(x,y)$ with section boundaries at times $x$,
$z$, $y$, and $x + n_{x,y}$, as shown in Figure 11.3. Then the first two
sections of $T(\{x,z,y,x+n_{x,y}\})$ give the desired special two-section
trellis for computing (11.12).

In fact, from (11.12) and the properties of the special two-section trellis
for $p_{x,y}(C)$, we only need the second section of
$T(\{x,z,y,x+n_{x,y}\})$ to construct the table $CBT_{x,y}$. For convenience,
we denote this special one-section trellis by $T_x(z,y)$. Table $CBT_{x,z}$
gives the state metrics of $T_x(z,y)$ at time-$z$, and table $CBT_{z,y}$ gives
the composite branch metrics of $T_x(z,y)$ between time-$z$ and time-$y$.
Therefore, the implementation of the RMLD algorithm does not require the
construction of the code trellis $T$ for the entire code $C$; it only requires
the construction of the special one-section trellises, one for each recursion
step. Each of these special one-section trellises has the minimum (state and
branch) complexity for constructing a composite branch metric table using the
CombCBT procedure. This reduces decoding complexity considerably.


Example 11.1 Consider the RM code $C = RM_{2,4}$ given in Example 6.3. Let
$x = 4$, $y = 12$, and $z = 8$. Then, $p_{x,y}(C) = RM_{2,3}$,
$C^{tr}_{x,y} = RM_{1,3}$, $C^{tr}_{x,z} = C^{tr}_{z,y} = RM_{0,2}$, and [...]
It can be put in [...] minimal trellis [...] One of the com[...] by adding
$(0, 0, 0, 1)$ to each branch label. △△

[Figure 11.3: caption not recoverable; the figure shows an initial state and
the state counts of the trellis.]

Figure 11.4. A parallel component of $T_4(8,12)$ for the $RM_{2,4}$ code,
drawn between time $z = 8$ and time $y = 12$, with composite branch labels
$P_1 \triangleq \{0000, 1111\}$, $P_2 \triangleq \{0011, 1100\}$,
$P_3 \triangleq \{0101, 1010\}$, $P_4 \triangleq \{0110, 1001\}$.

208 TRELLISES AND TRELLIS-BASED DECODING ALGORITHMS FOR LINEAR BLOCK CODES

From (11.12), we see that the computation of the composite branch metric
$m(D_y)$ depends on the size of the set $S_z(D_y)$. Since for a coset
$D_y \in p_{x,y}(C)/C^{tr}_{x,y}$ the truncation $p_{x,z}(D_y)$ is a union of
cosets in $p_{x,z}(C)/C^{tr}_{x,z}$, it follows from property (3) of the
special two-section trellis that for every state $\sigma(D_y) \in \Sigma_y$,
the number of composite branches merging into the state $\sigma(D_y)$ is

$$|S_z(D_y)| = \frac{|D_y|}{|D_z| \cdot |L(\sigma(D_z),\sigma(D_y))|} = \frac{|C^{tr}_{x,y}|}{|C^{tr}_{x,z}| \cdot |C^{tr}_{z,y}|}, \qquad (11.14)$$

which is exactly the same as (11.5). From (11.14), we can readily determine
the number of computation operations required to compute $m(D_y)$.
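
The count in (11.14) can be checked numerically on the section of Example 11.1. The sketch below is a brute-force check under stated assumptions: $RM_{2,4}$ is generated from monomials of degree at most 2 with the coordinate ordering chosen in `rm_generator` (the book's ordering in Example 6.3 may differ), cosets are found by enumerating all 2048 codewords, and every helper name is hypothetical.

```python
import itertools

def rm_generator(r, m):
    """Generator rows of RM(r, m): evaluations of all monomials of degree <= r."""
    return [
        tuple(int(all((j >> i) & 1 for i in vars_)) for j in range(2 ** m))
        for deg in range(r + 1)
        for vars_ in itertools.combinations(range(m), deg)
    ]

def span(gen_rows):
    """Enumerate every codeword of the binary code generated by gen_rows."""
    n = len(gen_rows[0])
    return {
        tuple(sum(a * g[i] for a, g in zip(coeffs, gen_rows)) % 2 for i in range(n))
        for coeffs in itertools.product((0, 1), repeat=len(gen_rows))
    }

def subcode(code, x, y):
    """C^tr_{x,y}: codewords that vanish outside [x, y), truncated to [x, y)."""
    return {c[x:y] for c in code if not any(c[:x]) and not any(c[y:])}

def coset_key(v, sub):
    """Canonical representative (lexicographic minimum) of the coset v + sub."""
    return min(tuple(a ^ b for a, b in zip(v, s)) for s in sub)

def dim(words):
    """Dimension of a linear code given as a set of 2^k words."""
    return len(words).bit_length() - 1

def merging_cosets(code, x, z, y):
    """Map each coset D_y to S_z(D_y), the cosets D_z contained in its truncation."""
    sub_y, sub_z = subcode(code, x, y), subcode(code, x, z)
    s_z = {}
    for c in code:
        s_z.setdefault(coset_key(c[x:y], sub_y), set()).add(coset_key(c[x:z], sub_z))
    return s_z

code = span(rm_generator(2, 4))
x, z, y = 4, 8, 12
s_z = merging_cosets(code, x, z, y)
# Exponent of (11.5) / (11.14): k(C_{x,y}) - k(C_{x,z}) - k(C_{z,y})
rho = dim(subcode(code, x, y)) - dim(subcode(code, x, z)) - dim(subcode(code, z, y))
```

For this section the check should give eight cosets $D_y$, each with $|S_z(D_y)| = 2^{\rho} = 4$, which matches the two four-state parallel components described in Example 11.2.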

Next we need to devise efficient methods to solve (11.12) using the
one-section trellis $T_x(z,y)$ so that either the computational complexity of
the CombCBT procedure is reduced or the circuit requirement and chip size of
IC implementation of the CombCBT procedure are reduced. Two methods for
solving (11.12) are presented in Sections 11.5 and 11.7, and they result in
two specific CombCBT procedures, named the CombCBT-V and the CombCBT-U
procedures.


11.5 COMBCBT-V(X, Y; Z) PROCEDURE

A straightforward procedure to solve (11.12) based on the one-section trellis
$T_x(z,y)$ is to apply the conventional add-compare-select (ACS) procedure
that is used in the conventional Viterbi algorithm. For each coset $D_y$ in
$p_{x,y}(C)/C^{tr}_{x,y}$, the metric sum
$m(D_z) + m(\sigma(D_z),\sigma(D_y))$ is computed for every state
$\sigma(D_z)$ with $D_z \in S_z(D_y)$, and $m(D_y)$ is found by comparing all
the computed metric sums. This procedure is called the CombCBT-V($x,y;z$)
procedure, where V stands for Viterbi algorithm.

Since the Viterbi algorithm is applied to a one-section trellis diagram to
construct a composite branch metric table from two smaller tables, the IC
implementation of the CombCBT-V procedure is quite simple and straightforward.

Let $\psi_C^{(V)}(x,y;z)$ denote the value of $\psi_C(x,y;z)$ for the
CombCBT-V($x,y;z$) procedure. Note that the number of states at time-$y$ in
$T_x(z,y)$ is $2^{k(p_{x,y}(C)) - k(C_{x,y})}$, and for each state
$\sigma(D_y)$ at time-$y$, the number of states $\sigma(D_z)$ at time-$z$ in
$T_x(z,y)$ which are adjacent to $\sigma(D_y)$ is given by (11.14). Let

$$\rho_z(C_{x,y}) \triangleq k(C_{x,y}) - k(C_{x,z}) - k(C_{z,y}). \qquad (11.15)$$

Then, the total number of additions and that of comparisons executed by the
CombCBT-V($x,y;z$) procedure are

$$2^{k(p_{x,y}(C)) - k(C_{x,y}) + \rho_z(C_{x,y})} \quad \text{and} \quad 2^{k(p_{x,y}(C)) - k(C_{x,y})} (2^{\rho_z(C_{x,y})} - 1),$$

respectively. Consequently, the computational complexity of the
CombCBT-V($x,y;z$) procedure is given by

$$\psi_C^{(V)}(x,y;z) = 2^{k(p_{x,y}(C)) - k(C_{x,y})} (2^{\rho_z(C_{x,y}) + 1} - 1). \qquad (11.16)$$
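
As a worked instance of (11.15) and (11.16), consider again the section of Example 11.1 ($C = RM_{2,4}$, $x = 4$, $z = 8$, $y = 12$). The numbers below assume $k(p_{4,12}(C)) = k(RM_{2,3}) = 7$, $k(C_{4,12}) = k(RM_{1,3}) = 4$ and $k(C_{4,8}) = k(C_{8,12}) = k(RM_{0,2}) = 1$, which follow from the codes named in that example.

```latex
\rho_8(C_{4,12}) = 4 - 1 - 1 = 2, \\
\text{additions: } 2^{7-4+2} = 32, \qquad
\text{comparisons: } 2^{7-4}\,(2^{2} - 1) = 24, \\
\psi_C^{(V)}(4,12;8) = 2^{7-4}\,(2^{2+1} - 1) = 56 = 32 + 24 .
```

The total agrees with the sum of the separate addition and comparison counts, as (11.16) requires.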


11.6 RMLD-(I,V) AND RMLD-(G,V) ALGORITHMS

Combining the CombCBT-V procedure with either the MakeCBT-I procedure or the
MakeCBT-G procedure, we obtain two specific RMLD algorithms, denoted
RMLD-(I,V) and RMLD-(G,V). From (11.6), (11.7), (11.8), and (11.16), we can
compute the total number of addition-equivalent operations required by the
RMLD-(I,V) algorithm for decoding a received word. The computational
complexity of the RMLD-(G,V) algorithm can be computed from (11.6), (11.7),
(11.9), and (11.16).
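
The computation described above can be sketched for a small code. The following is an illustrative calculation only: it evaluates (11.8), (11.15), and (11.16) by brute-force enumeration of the toy (8,4) code RM(1,3), which is chosen here and is not an example from the text, and then applies (11.6) and (11.7) with a memoized recursion. The names `rmld_iv_cost`, `k_p`, and `k_c` are hypothetical.

```python
import itertools
from functools import lru_cache

def span(gen_rows):
    """Enumerate every codeword of the binary code generated by gen_rows."""
    n = len(gen_rows[0])
    return {
        tuple(sum(a * g[i] for a, g in zip(coeffs, gen_rows)) % 2 for i in range(n))
        for coeffs in itertools.product((0, 1), repeat=len(gen_rows))
    }

def rmld_iv_cost(code):
    """Return (psi_min(0, N), psi_M^(I), psi_C^(V)) for RMLD-(I,V) on `code`."""
    N = len(next(iter(code)))

    def dim(words):                 # log2 of the size of a linear code
        return len(words).bit_length() - 1

    def k_p(x, y):                  # k(p_{x,y}(C))
        return dim({c[x:y] for c in code})

    def k_c(x, y):                  # k(C_{x,y})
        return dim({c[x:y] for c in code if not any(c[:x]) and not any(c[y:])})

    def psi_m_i(x, y):              # equation (11.8)
        return ((y - x - 1) * 2 ** k_p(x, y)
                + 2 ** (k_p(x, y) - k_c(x, y)) * (2 ** k_c(x, y) - 1))

    def psi_c_v(x, y, z):           # equations (11.15) and (11.16)
        rho = k_c(x, y) - k_c(x, z) - k_c(z, y)
        return 2 ** (k_p(x, y) - k_c(x, y)) * (2 ** (rho + 1) - 1)

    @lru_cache(maxsize=None)
    def psi_min(x, y):              # equations (11.6) and (11.7)
        options = [psi_m_i(x, y)]
        options += [psi_min(x, z) + psi_min(z, y) + psi_c_v(x, y, z)
                    for z in range(x + 1, y)]
        return min(options)

    return psi_min(0, N), psi_m_i, psi_c_v

G = [(1,) * 8, (0, 0, 0, 0, 1, 1, 1, 1), (0, 0, 1, 1, 0, 0, 1, 1), (0, 1, 0, 1, 0, 1, 0, 1)]
total, psi_m_i, psi_c_v = rmld_iv_cost(span(G))
```

For this toy code, `psi_m_i(0, 8)` gives the cost of building $CBT_{0,8}$ directly, while `total` is the cost under the best sectionalization; the two-symbol sections also provide a spot check of the equality stated in Theorem 11.1(i).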

For either the RMLD-(I,V) algorithm or the RMLD-(G,V) algorithm, we need to
know for what values of $y - x$ the CombCBT-V procedure should be executed to
construct the table $CBT_{x,y}$. This is answered by the following two
theorems. We simply state the theorems here without proofs, which can be found
in [ill].

Theorem 11.1 Consider a binary linear code $C$ of length $N$ such that the
minimum Hamming distances of $C$ and its dual code are both greater than one.

(i) If $y - x > 2$, then for any $z$ with $x < z < y$, the
CombCBT-V($x,y;z$) procedure requires less computation to form the metric
table $CBT_{x,y}$ than the MakeCBT-I($x,y$) procedure. If $y - x = 2$, the
complexities of CombCBT-V($x,y;x+1$) and MakeCBT-I($x,y$) are the same.

(ii) If $y - x > 2$, $k(p_{x,y}(C)) = y - x$ and
$C^{tr}_{x,y} = \{0_{y-x}\}$ or $\{0_{y-x}, (1, \ldots, 1)\}$, where
$0_{y-x}$ denotes the all-zero vector of length $y - x$ and $(1, \ldots, 1)$
denotes a vector of length $y - x$ such that the first and the last components
are 1, then the right-hand side of (11.6) takes its minimum for both
$z = \lfloor (x+y)/2 \rfloor$ and $z = \lceil (x+y)/2 \rceil$. △△

Theorem 11.1 simply says that for $y - x > 2$, the CombCBT-V($x,y;z$)
procedure should be used to construct the metric table $CBT_{x,y}$ in the
RMLD-(I,V) algorithm.

Theorem 11.2 Consider a binary linear code $C$ of length $N$ such that the
minimum Hamming distances of $C$ and its dual code are both greater than one.
For the RMLD-(G,V) algorithm,

(i) If $k(p_{x,y}(C)) = y - x$ and $C^{tr}_{x,y} = \{0_{y-x}\}$ or
$\{0_{y-x}, (1, \ldots, 1)\}$, then the MakeCBT-G($x,y$) procedure requires
less computation than the CombCBT-V($x,y;z$) procedure for any $z$ with
$x < z < y$ to form the metric table $CBT_{x,y}$ for $y - x > 2$. When
$y - x = 2$, they are the same.

(ii) If the conditions of (i) do not hold, then the CombCBT-V($x,y;z$)
procedure with some $z$ is more efficient than the MakeCBT-G($x,y$) procedure
for constructing the metric table $CBT_{x,y}$ for $y - x > 2$. Moreover, if
$k(p_{x,y}(C)) < y - x$ and $C^{tr}_{x,y} = \{0_{y-x}\}$ or
$\{0_{y-x}, (1, \ldots, 1)\}$, then the right-hand side of (11.6) takes its
minimum for both $z = \lfloor (x+y)/2 \rfloor$ and
$z = \lceil (x+y)/2 \rceil$.

Since the Viterbi algorithm is applied to a one-section trellis diagram to 
construct a composite branch metric table from two smaller tables, the IC im- 
plementations of both the RMLD-(I,V) and RMLD-(G,V) algorithms are quite 
simple and straightforward. For high speed decoders, the MakeCBT-I proce- 
dure is more suitable than the MakeCBT-G procedure, since branch metrics 
can be computed in parallel. As shown in Theorem 11.1, in the optimum sec- 
tionalization, the value of y — x for the MakeCBT-I procedure to be executed 
can be kept equal to 2, but this is not necessarily the case for the MakeCBT-G 
procedure (see Theorem 11.2). Hence, IC implementation of the MakeCBT-I 
procedure is easier. Furthermore, with the MakeCBT-G procedure, the metrics 
must be computed serially. 
11.7 COMBCBT-U(X, Y; Z) PROCEDURE

This procedure is based on the decomposition of the one-section trellis
$T_x(z,y)$ into simple uniform subtrellises, as described in Section 6.4. The
one-section trellis $T_x(z,y)$ may consist of parallel isomorphic components.
These parallel components can be partitioned into groups of the same size in
such a way that: (1) two parallel components in the same group are identical
up to path labeling; and (2) two parallel components in two different groups
do not have any path label in common [44]. Each group consists of
$2^{\lambda}$ identical parallel components, where $\lambda$ can be computed
from (6.36) with $C(x,y)$ as the code.

Furthermore, each parallel component of $T_x(z,y)$ can be decomposed into
subtrellises with simple uniform structures, as shown in Figure 11.5, by
applying Theorem 3 of [44] (also see Section 6.3) to the code $C(x,y)$ that
was used for constructing the one-section trellis $T_x(z,y)$. Consider a
parallel component $\Lambda$. The state spaces at the two ends of the parallel
component can be partitioned into blocks of the same size $2^{\nu}$, called
left U-blocks and right U-blocks, respectively, where $\nu$ can be computed
from (6.41) with $C$ replaced by $C(x,y)$.

A pair of a left U-block and a right U-block is called a U-block pair, and
each U-block pair $(B_z,B_y)$ has the following uniform structure, denoted U:
For any two states $\sigma_y$ and $\sigma'_y$ in $B_y$,

$$\{L(\sigma_z,\sigma_y) : \sigma_z \in B_z\} = \{L(\sigma_z,\sigma'_y) : \sigma_z \in B_z\}. \qquad (11.17)$$

The above property simply says that for a U-block pair $(B_z,B_y)$, the set of
composite branches from the states in the left U-block $B_z$ to any state in
the right U-block $B_y$ is the same. This property can be used in solving
(11.12) to reduce the computational complexity. The label set of composite
branches defined by (11.17) is called the composite branch label set of the
U-block pair $(B_z,B_y)$. Two different U-block pairs have mutually disjoint
composite branch label sets.

The CombCBT-U($x,y;z$) procedure is devised based on the uniform structure of
a U-block pair. In contrast to the CombCBT-V($x,y;z$) procedure, which solves
(11.12) independently for every state $\sigma(D_y)$ with
$D_y \in p_{x,y}(C)/C^{tr}_{x,y}$, CombCBT-U($x,y;z$) solves (11.12)
simultaneously for each U-block pair $(B_z,B_y)$ of a parallel component of
$T_x(z,y)$ by taking into account the uniform property U given by (11.17). For
a parallel component $\Lambda$ of $T_x(z,y)$, let $LU(\Lambda)$ denote the set
of left U-blocks in $\Lambda$. Based on (11.17), (11.12) can be put in the
following form: For a state $\sigma(D_y)$ in a right U-block $B_y$,

$$m_{B_z}(D_y) \triangleq \max_{D_z \in \{D_z : \sigma(D_z) \in B_z\}} \{ m(D_z) + m(\sigma(D_z),\sigma(D_y)) \}, \quad \text{for } B_z \in LU(\Lambda), \qquad (11.18)$$

$$m(D_y) = \max_{B_z \in LU(\Lambda)} m_{B_z}(D_y). \qquad (11.19)$$

Equations (11.18) and (11.19) show that (11.12) can be solved simultaneously
for each U-block pair. This allows parallel processing to speed up the
computation. In fact, the computations of (11.18) and (11.19) can be carried
out for all the parallel components of $T_x(z,y)$ in parallel.

Figure 11.5. The left U-blocks and right U-blocks of a parallel component in
$T_x(z,y)$.

For easy understanding, an example is used to explain how to solve (11.18)
for each U-block pair $(B_z,B_y)$.

Example 11.2 Again consider the RM code C = RM 2<4 . As shown in Ex- 
ample 11.1, the one-section trellis diagram T x (z,y) with x = 4, y = 12 and 
z = 8 consists of two four-state parallel components. From (6.36) and (6.41), 
we find that A = 0 and v = 2. Therefore, the two parallel components are not 
identical, and each consists of only one left U-block and one right U-block. As 
shown in Figure 11.4, the four end states of one parallel component at time-8, 
denoted a(D { z l) ) y <j(D { z 2) ), a[D { z 3) ), a[D { z ) ) y form a single left U-block, and the 
4 end states at time-12, denoted ^(D^), <7(D* 2) ), cr(D y 3) ), <r(Dy 4} ), form a 
single right U-block. There are four different composite branch labels between 
them, denoted 

Pi k {0000, 1111}, P 2 = {0011,1100}, 

P 3 = {0101, 1010}, P 4 = {0 TO, 1001}. 

The set of the composite branch labels merging into any state cr(By ; *) at time- 
12 is {Pi,P 2 ,P 3 ,P 4 }. 

From (11.18) and (11.19), the largest metric, denoted m(By^), for the coset 
Dy^ with 1 < j < 4 is given by 


m(£>^) = max{m(B^) + m(P b(t , j) )}, 

l<t<4 


( 11 . 20 ) 
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where 6(i, j) is the unique integer such that L(a(D[ 

To compute the metrics $m(D_y^{(j)})$ for $1 \le j \le 4$, form the following set of
metric sums:

$$M \triangleq \{ m(D_z^{(i)}) + m(P_b) : 1 \le i \le 4,\ 1 \le b \le 4 \}.$$

Each sum in $M$ is associated with a path in the trellis of Figure 11.4. Clearly
the largest sum in $M$ corresponds to the survivor path for the associated state
$\sigma(D_y^{(q)})$ at time-$y$, i.e., $m(D_y^{(q)})$ is equal to the largest sum in $M$. Thus this
value for the coset $D_y^{(q)}$ is entered in table $\mathrm{CBT}_{x,y}$. The process then proceeds by
examining the second, third, ... largest sums in $M$. If the $j$-th largest sum $M_j$
corresponds to state $\sigma(D_y^{(q)})$, and $\mathrm{CBT}_{x,y}$ contains no entry for the coset $D_y^{(q)}$,
then $M_j$ is entered in $\mathrm{CBT}_{x,y}$. This process continues until $\mathrm{CBT}_{x,y}$ contains
an entry for each $D_y^{(j)}$.

Similarly, the metrics of cosets that correspond to the four states at time-
12 in the other parallel component in the one-section trellis $T_x(z,y)$ can be
computed. This completes the construction of table $\mathrm{CBT}_{x,y}$.

We can find the $j$-th largest sum of $M$ more efficiently by pre-sorting

$$\{ m(D_z^{(i)}) : 1 \le i \le 4 \} \quad \text{and} \quad \{ m(P_b) : 1 \le b \le 4 \}.$$

△△

In general, for a U-block pair $(B_z, B_y)$ with $B_z = \{\sigma(D_z^{(1)}), \sigma(D_z^{(2)}), \ldots,
\sigma(D_z^{(2^\nu)})\}$, $B_y = \{\sigma(D_y^{(1)}), \sigma(D_y^{(2)}), \ldots, \sigma(D_y^{(2^\nu)})\}$, and the composite branch
label set of $(B_z, B_y)$, $\{P_1, P_2, \ldots, P_{2^\nu}\}$, (11.18) is solved in the following way:

(S1) Sort $m(D_z^{(1)}), m(D_z^{(2)}), \ldots, m(D_z^{(2^\nu)})$ in decreasing order.

(S2) Sort $m(P_1), m(P_2), \ldots, m(P_{2^\nu})$ in decreasing order.

(S3) Form $M \triangleq \{ m(D_z^{(i)}) + m(P_b) : 1 \le i \le 2^\nu,\ 1 \le b \le 2^\nu \}$. Determine
$m(D_y^{(j)})$ with $1 \le j \le 2^\nu$ as described in Example 11.2 by using the
following partial ordering on $M$:

$$m(D_z^{(i)}) + m(P_b) \ge m(D_z^{(i')}) + m(P_{b'}),$$

if $m(D_z^{(i)}) \ge m(D_z^{(i')})$ and $m(P_b) \ge m(P_{b'})$.
Clearly the above procedure for a U-block pair can be executed for all the
U-block pairs in all the distinct parallel components in the one-section trellis
$T_x(z,y)$ simultaneously.
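Steps (S1) to (S3) can be sketched for a single U-block pair as follows. This is a minimal illustration, not the procedure of [37]: the function name `solve_u_block`, the lookup table `dest` (which right state is reached from left state i through label b), and the XOR wiring used in the sample call are all assumptions made for the sketch. It assumes every (left state, label) pair leads to exactly one right state. The heap visits the sums of $M$ in non-increasing order, using the partial ordering to decide which sums can come next.

```python
import heapq

def solve_u_block(left, label, dest):
    """Survivor metrics for one U-block pair via lazy descending enumeration.

    left[i]    : m(D_z^(i)), metric of the i-th left state
    label[b]   : m(P_b), metric of the b-th composite branch label
    dest[i][b] : index j of the right state reached from left state i
                 through label b (hypothetical lookup table)
    Returns {j: (metric, i, b)} for every right state j.
    """
    rows = sorted(range(len(left)), key=lambda i: -left[i])    # (S1)
    cols = sorted(range(len(label)), key=lambda b: -label[b])  # (S2)
    targets = {j for row in dest for j in row}

    def total(r, c):
        return left[rows[r]] + label[cols[c]]

    best = {}
    heap = [(-total(0, 0), 0, 0)]   # heapq is a min-heap, so store negated sums
    seen = {(0, 0)}
    while heap and len(best) < len(targets):
        neg, r, c = heapq.heappop(heap)
        i, b = rows[r], cols[c]
        j = dest[i][b]
        if j not in best:            # first hit for state j is its largest sum
            best[j] = (-neg, i, b)
        # (S3) By the partial ordering, the sums at (r+1, c) and (r, c+1)
        # cannot exceed the sum at (r, c), so they are queued only now.
        for r2, c2 in ((r + 1, c), (r, c + 1)):
            if r2 < len(rows) and c2 < len(cols) and (r2, c2) not in seen:
                seen.add((r2, c2))
                heapq.heappush(heap, (-total(r2, c2), r2, c2))
    return best


# Illustrative 4-state U-block; the XOR wiring is an assumption, not Figure 11.4.
dest = [[i ^ b for b in range(4)] for i in range(4)]
result = solve_u_block([3, -1, 4, 0], [2, 5, -3, 1], dest)
```

The loop stops as soon as every right state has an entry, so on favorable inputs only a fraction of the $2^\nu \cdot 2^\nu$ sums is ever formed, which is why the operation count depends on the received sequence.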

Note that the CombCBT-U(x, y; z) procedure is identical to the CombCBT-V(x, y; z)
procedure only for the case of $\nu = 0$ (the trivial case in which each
left U-block and right U-block consist of a single state).

Let $\psi_c^{(U)}(x,y;z)$ denote the number of addition-equivalent operations of
CombCBT-U(x, y; z). The computational complexity for solving (11.18) depends
on the received sequence. In the following, an upper bound on $\psi_c^{(U)}(x,y;z)$
for the worst case is given, which is independent of the received sequence.
Without derivation, the bound is given below [37]:


4 >c ) (z.y;*) < 


where 


*7(2*') = 


*'^ c H" fc l c * i'l((l -f -L)2 />f ^ c * »l — 1) (11.21) 


i/, for v = 0, 1, 

2(u — 1){2 U — 1) — 1, otherwise. 


( 11 . 22 ) 


Let $\bar{\psi}_c^{(U)}(x,y;z)$ denote the upper bound given by the right-hand side of (11.21).
It can be shown that [37]

$$\bar{\psi}_c^{(U)}(x,y;z) < \psi_c^{(V)}(x,y;z). \qquad (11.23)$$

The inequality of (11.23) holds for $\nu \ge 2$. As $\nu$ becomes large, the ratio

$$\bar{\psi}_c^{(U)}(x,y;z) / \psi_c^{(V)}(x,y;z)$$

decreases rapidly. This says that the CombCBT-U procedure is more efficient
than the CombCBT-V procedure computation-wise.


11.8 RMLD-(G,U) ALGORITHM

The CombCBT-U procedure can be combined with either the MakeCBT-I procedure
or the MakeCBT-G procedure to form specific RMLD algorithms. Since
the MakeCBT-G procedure requires less computational complexity than the
MakeCBT-I procedure, the MakeCBT-G procedure and the CombCBT-U procedure
are combined to form an RMLD algorithm, called the RMLD-(G,U)
algorithm.





Table 11.1. Numbers of addition-equivalent operations with various maximum likelihood
decoding algorithms for some RM and extended BCH codes of length 64.

Code, (Basis)          64-section   RMLD-(I,V)   RMLD-(G,V)    RMLD-(G,U)     Lafourcade
                                                               upper bound    & Vardy [60]
                                                               $\psi_{\min}$
------------------------------------------------------------------------------------------
RM_{2,6}(64,22)           425,209       78,209       77,896        66,824        101,786
RM_{3,6}(64,42)           773,881      326,017      323,759       210,671        538,799
RM_{4,6}(64,57)             7,529        5,281        4,999         4,087          6,507
EBCH(64,10), (C)           20,073        3,201        3,108         3,108          4,074
EBCH(64,16), (B)          764,153      120,193      119,880        96,840        148,566
EBCH(64,18), (B)        2,865,401      468,040      468,040       372,808        509,120
EBCH(64,24), (B)        1,327,353      271,745      271,432       171,823        316,608
EBCH(64,30), (C)       35,028,985   16,091,009   16,056,668     9,408,567     16,598,063
EBCH(64,36), (C)       18,710,521    9,995,617    9,961,580     7,684,276     12,829,263
EBCH(64,39), (C)       38,436,857   24,741,161   24,707,149    19,841,161     30,982,731
EBCH(64,45), (C)        1,082,105      893,489      891,695       665,713        891,819
EBCH(64,51), (A)          418,553      312,721      312,382       257,300        393,528


From (11.6), (11.7), (11.9) and (11.21), we can compute an upper bound,
denoted $\psi_{\min}$, on the worst-case computational complexity of the RMLD-(G,U)
algorithm for decoding a received word.

11.9 COMPARISONS 

Among the three specific RMLD algorithms, RMLD-(I,V), RMLD-(G,V) and
RMLD-(G,U), the RMLD-(G,U) algorithm is the most efficient one computation-wise,
while the RMLD-(I,V) algorithm is the simplest for IC implementation.

In the following, the three specific RMLD algorithms are applied to some
well-known codes of length 64 to show their effectiveness in terms of computational
complexity.

Let EBCH(64, $k$) denote the extended code obtained from the binary primitive
(63, $k$) BCH code by adding an overall parity bit. The computational
complexities of decoding the $\mathrm{RM}_{r,6}$ codes with $2 \le r \le 4$ and the permuted
EBCH(64, $k$) codes with $10 \le k \le 51$ are computed based on certain symbol
position permutations and given in Table 11.1. Hereafter, $\mathrm{RM}_{r,m}$ is denoted
$\mathrm{RM}_{r,m}(2^m, \sum_{i=0}^{r} \binom{m}{i})$ to show the number of information bits explicitly.

To reduce the state complexities of the trellis diagrams for EBCH(64, $k$)
codes, the order of symbol positions must be permuted. For the RM codes, the
natural symbol ordering is optimal for the state complexity [45]. For the EBCH
codes, consider the following permutations $\pi$ [46]. Let $\alpha$ be a primitive element
of GF($2^6$) and $\{\beta_1, \ldots, \beta_6\}$ a basis of GF($2^6$) over GF(2). For a positive integer
$i$ less than $2^6$, let $\alpha^{i-1}$ be expressed as $\alpha^{i-1} = \sum_{j=1}^{6} b_{i,j} \beta_j$, with $b_{i,j} \in$ GF(2).
For $i = 0$, let $b_{0,j} = 0$ for $1 \le j \le 6$. Then $\pi$ is the following permutation
on $\{1, 2, \ldots, 2^6\}$: $\pi(i) \triangleq 1 + \sum_{j=1}^{6} b_{i-1,j} 2^{6-j}$, for $1 \le i \le 2^6$. Consider the
following three bases for codes of length 64: (1) Basis A is the polynomial basis,
$\{1, \alpha, \alpha^2, \ldots, \alpha^5\}$; (2) Basis B is $\{1, \alpha, \alpha^2, \alpha^{21}, \alpha^{22}, \alpha^{23}\}$, which is obtained by
combining a basis of GF($2^6$) over GF($2^2$), $\{1, \alpha, \alpha^2\}$, and a basis of GF($2^2$)
over GF(2), $\{1, \alpha^{21}\}$; (3) Basis C is $\{1, \alpha, \alpha^9, \alpha^{10}, \alpha^{18}, \alpha^{19}\}$, which is obtained
by combining a basis of GF($2^6$) over GF($2^3$), $\{1, \alpha\}$, and a basis of GF($2^3$) over
GF(2), $\{1, \alpha^9, \alpha^{18}\}$.
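The construction of $\pi$ can be tried out numerically with a short sketch. The text does not say which primitive polynomial defines GF($2^6$); the code below assumes $x^6 + x + 1$, and the helper names (`gf64_powers`, `coordinates`, `permutation`) are hypothetical. Each basis is given by the exponents of $\alpha$ listed above, and the weight $2^{6-j}$ for $j = 1, \ldots, 6$ follows the formula for $\pi(i)$.

```python
def gf64_powers(poly=0b1000011):
    """Return [alpha^0, ..., alpha^62] as 6-bit ints (bit k = coefficient of alpha^k).

    poly encodes x^6 + x + 1, an assumed primitive polynomial for GF(2^6).
    """
    powers, x = [], 1
    for _ in range(63):
        powers.append(x)
        x <<= 1                 # multiply by alpha
        if x & 0b1000000:       # reduce modulo the field polynomial
            x ^= poly
    return powers


def coordinates(value, basis):
    """Coefficients (b_1, ..., b_6) of value in the given basis, by exhaustive search."""
    for mask in range(64):
        acc = 0
        for j in range(6):
            if (mask >> j) & 1:
                acc ^= basis[j]
        if acc == value:
            return [(mask >> j) & 1 for j in range(6)]
    raise ValueError("the given elements do not form a basis")


def permutation(exponents):
    """Return [pi(1), ..., pi(64)] for the basis {alpha^e : e in exponents}."""
    powers = gf64_powers()
    basis = [powers[e] for e in exponents]
    # rows[0] is the all-zero row b_{0,j}; rows[i] expands alpha^(i-1) for i >= 1.
    rows = [[0] * 6] + [coordinates(p, basis) for p in powers]
    return [1 + sum(b << (5 - j) for j, b in enumerate(rows[i - 1]))
            for i in range(1, 65)]


BASIS_A = [0, 1, 2, 3, 4, 5]
BASIS_B = [0, 1, 2, 21, 22, 23]
BASIS_C = [0, 1, 9, 10, 18, 19]
pi_a, pi_b, pi_c = (permutation(e) for e in (BASIS_A, BASIS_B, BASIS_C))
```

A different primitive polynomial would give a different but equally valid labeling of the field, so the specific values of $\pi$ produced here should be read as illustrative only.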

Table 11.1 gives the total numbers of addition-equivalent operations required
by the three specific RMLD algorithms, RMLD-(I,V), RMLD-(G,V)
and RMLD-(G,U), for decoding the above codes. For the RMLD-(G,U) algorithm,
only the values of the upper bound $\psi_{\min}$ on the worst-case computational
complexity are given. For comparison, the numbers of addition-equivalent
operations required for decoding the above codes with the Viterbi
decoding algorithm based on the optimum sectionalization presented in Section 10.2
(the Lafourcade and Vardy algorithm) are also included. The column labeled
64-section gives the numbers of operations required in the conventional Viterbi
decoding based on the bit-level 64-section minimal trellis diagram.

For all EBCH codes other than the EBCH(64,51) code, the symbol permutations
indicated in Table 11.1 give the smallest optimum values for each column
among the three symbol permutations given above. For EBCH(64,51), Basis
B gives the smallest number of operations required in the conventional Viterbi
decoding based on the 64-section trellis diagram among the three permutations,
but Basis A gives the smallest values for the other columns. This shows that a
good bit ordering for the $N$-section trellis diagram is not always good for the
proposed RMLD decoding procedures. The last column in Table 11.1 shows the
numbers of addition-equivalent operations given by Lafourcade and Vardy [60].

Table 11.1 shows that the RMLD-(G,U) algorithm is the most efficient
trellis-based decoding algorithm even in terms of the worst-case computational
complexity, and the difference between the computational complexities of the
RMLD-(I,V) and RMLD-(G,V) algorithms is very small. All three RMLD
algorithms are more efficient than the Viterbi decoding algorithm based on
optimum sectionalization [60], the only exception being the RMLD-(I,V) algorithm for
the EBCH(64,45) code. For each algorithm, the number of basic operations
executed by the MakeCBT procedure is relatively small compared with that
executed by the CombCBT procedure. Consider the EBCH(64,45) code. When
this code is decoded with the RMLD-(I,V) algorithm, the number of basic operations
executed by the MakeCBT-I procedure is 108 out of a total of 893,489 basic
operations. Using the RMLD-(G,U) algorithm, the MakeCBT-G procedure
executes 818 basic operations out of a total of 665,713 basic operations.
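The comparisons stated above can be checked mechanically against the entries of Table 11.1. The following sketch simply re-keys the table as a dictionary (the short code names and the helper names `exceptions` and `share_percent` are choices made here, not notation from the text) and lists every case where an RMLD algorithm does not beat the Lafourcade and Vardy count.

```python
# Columns: 64-section, RMLD-(I,V), RMLD-(G,V), RMLD-(G,U) bound, Lafourcade-Vardy.
TABLE_11_1 = {
    "RM(64,22)":   (425209, 78209, 77896, 66824, 101786),
    "RM(64,42)":   (773881, 326017, 323759, 210671, 538799),
    "RM(64,57)":   (7529, 5281, 4999, 4087, 6507),
    "EBCH(64,10)": (20073, 3201, 3108, 3108, 4074),
    "EBCH(64,16)": (764153, 120193, 119880, 96840, 148566),
    "EBCH(64,18)": (2865401, 468040, 468040, 372808, 509120),
    "EBCH(64,24)": (1327353, 271745, 271432, 171823, 316608),
    "EBCH(64,30)": (35028985, 16091009, 16056668, 9408567, 16598063),
    "EBCH(64,36)": (18710521, 9995617, 9961580, 7684276, 12829263),
    "EBCH(64,39)": (38436857, 24741161, 24707149, 19841161, 30982731),
    "EBCH(64,45)": (1082105, 893489, 891695, 665713, 891819),
    "EBCH(64,51)": (418553, 312721, 312382, 257300, 393528),
}
RMLD_NAMES = ("RMLD-(I,V)", "RMLD-(G,V)", "RMLD-(G,U)")


def exceptions(table):
    """List (code, algorithm) pairs where RMLD needs at least as many operations."""
    return [(code, RMLD_NAMES[k])
            for code, row in table.items()
            for k in range(3)
            if row[1 + k] >= row[4]]


def share_percent(part, total):
    """Percentage of the total operation count spent in one procedure."""
    return 100.0 * part / total


only_exceptions = exceptions(TABLE_11_1)
make_i_share = share_percent(108, 893489)   # MakeCBT-I within RMLD-(I,V)
make_g_share = share_percent(818, 665713)   # MakeCBT-G within RMLD-(G,U)
```

Both MakeCBT shares come out well under one percent of the respective totals, which is the sense in which the CombCBT procedure dominates the cost.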

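Let $\langle x,y \rangle$ denote the MakeCBT-G(x, y) operation, and let $\cdot$ denote the
CombCBT-U operation. The optimum trellis sectionalizations for $\mathrm{RM}_{3,6}(64,42)$
and EBCH(64,24) with Basis B for the complexity measure $\psi_{\min}$ of the
RMLD-(G,U) algorithm are identical, and represented as

$$((\langle 0,8 \rangle \cdot \langle 8,16 \rangle) \cdot (\langle 16,24 \rangle \cdot \langle 24,32 \rangle)) \cdot$$

$$((\langle 32,40 \rangle \cdot \langle 40,48 \rangle) \cdot (\langle 48,56 \rangle \cdot \langle 56,64 \rangle)).$$

The optimum trellis sectionalization for $\mathrm{RM}_{2,6}(64,22)$ for $\psi_{\min}$ is

$$((((((\langle 0,8 \rangle \cdot \langle 8,16 \rangle) \cdot (\langle 16,24 \rangle \cdot \langle 24,32 \rangle))$$

$$\cdot (\langle 32,40 \rangle \cdot \langle 40,48 \rangle)) \cdot \langle 48,56 \rangle) \cdot \langle 56,61 \rangle)$$

$$\cdot \langle 61,63 \rangle) \cdot \langle 63,64 \rangle,$$

and that for EBCH(64,30) with Basis C is

$$(\langle 0,16 \rangle \cdot \langle 16,32 \rangle) \cdot (\langle 32,48 \rangle \cdot \langle 48,64 \rangle).$$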
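A sectionalization written in this bracket notation is a binary tree, and the order in which the procedures run can be read off by a post-order walk. The sketch below is only an illustration under that reading: the nested-tuple encoding and the function name `schedule` are assumptions, and the operations are recorded as labels rather than executed.

```python
def schedule(tree):
    """Return the list of operations implied by a sectionalization tree.

    A leaf is a pair of ints (x, y), standing for MakeCBT-G(x, y).
    An inner node is a pair of subtrees, standing for CombCBT-U(x, y; z),
    where z is the boundary shared by the two subtrees.
    """
    ops = []

    def visit(node):
        left, right = node
        if isinstance(left, int):              # leaf <x, y>
            ops.append(("MakeCBT-G", left, right))
            return left, right
        x, z = visit(left)
        z2, y = visit(right)
        if z != z2:
            raise ValueError(f"sections do not meet: {z} != {z2}")
        ops.append(("CombCBT-U", x, y, z))
        return x, y

    visit(tree)
    return ops


# EBCH(64,30) with Basis C, as given in the text.
ebch30 = (((0, 16), (16, 32)), ((32, 48), (48, 64)))

# RM(64,22), built up step by step to mirror the nesting in the text.
t = (((0, 8), (8, 16)), ((16, 24), (24, 32)))
t = (t, ((32, 40), (40, 48)))
for leaf in ((48, 56), (56, 61), (61, 63), (63, 64)):
    t = (t, leaf)
rm22 = t

ops_ebch30 = schedule(ebch30)
ops_rm22 = schedule(rm22)
```

In the EBCH(64,30) tree the two halves are independent subtrees, so their operations could in principle run in parallel before the final combination at $z = 32$.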

The optimum trellis sectionalization for a code using the RMLD algorithm
is generally not unique. The above optimum trellis sectionalizations are chosen
in the following manner: If $\psi_{\min}(x,y) = \psi_M^{(G)}(x,y)$, then execute the
MakeCBT-G(x, y) procedure. Otherwise, the CombCBT-U(x, y; z) procedure
is executed for an integer $z$ such that $\psi_{\min}(x,y) = \psi_{\min}(x,z) + \psi_{\min}(z,y) +
\bar{\psi}_c^{(U)}(x,y;z)$ and $|z - (x+y)/2|$ is the smallest.

Table 11.2. Average numbers of addition-equivalent operations using the RMLD-(G,U)
algorithm for the RM_{3,6}(64,42) and EBCH(64,24) codes.

Code                   Quantity                                    SNR     RMLD-(G,U)
--------------------------------------------------------------------------------------
RM_{3,6}(64,42)        Upper bound on the worst-case complexity               210,671
                       The average number of operations           1 dB         66,722
                                                                  4 dB         66,016
                                                                  7 dB         63,573
                                                                 10 dB         61,724
EBCH(64,24), Basis B   Upper bound on the worst-case complexity               171,823
                       The average number of operations           1 dB         70,676
                                                                  4 dB         70,420
                                                                  7 dB         69,325
                                                                 10 dB         68,158

Using the trellis sectionalizations which are optimum with respect to the
measure $\psi_{\min}$, we evaluate the average numbers of addition-equivalent operations
for the RMLD-(G,U) algorithm, which are given in Table 11.2. It is assumed
that BPSK modulation is used on an AWGN channel. The average values at
the SNRs per information bit of 1, 4, 7 and 10 dB are listed in the rows labeled
"The average number of operations" for $\mathrm{RM}_{3,6}(64,42)$ and EBCH(64,24). We see
that these values vary only slightly and are much smaller than the worst-case
upper bound $\psi_{\min}$.
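To put a number on "much smaller", the ratio of the average count to the worst-case bound can be worked out directly from the entries of Table 11.2 at the lowest and highest listed SNRs (values rounded to two decimal places):

$$\mathrm{RM}_{3,6}(64,42): \quad \frac{66{,}722}{210{,}671} \approx 0.32 \ \ (1\ \mathrm{dB}), \qquad \frac{61{,}724}{210{,}671} \approx 0.29 \ \ (10\ \mathrm{dB}),$$

$$\mathrm{EBCH}(64,24): \quad \frac{70{,}676}{171{,}823} \approx 0.41 \ \ (1\ \mathrm{dB}), \qquad \frac{68{,}158}{171{,}823} \approx 0.40 \ \ (10\ \mathrm{dB}).$$

For these two codes the average is therefore roughly 30 to 40 percent of the bound, and it changes by only a few percentage points over the 1 to 10 dB range.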



